Raoul Grouls
A time series with 3 years of monthly data:
that’s just 36 datapoints…
Solution: keep your model as simple as possible.
The COMPAS algorithm, used in the US to predict recidivism among criminals.
The algorithm turned out to be racist: Black defendants were misclassified about twice as often.
Cause: a questionnaire was used that reflected the bias of society itself. E.g. Black and white people smoke roughly equal amounts of marijuana (as can be measured in sewer water), but Black people are about 10 times as likely to get a criminal record for it.
Solution: test for bias on subgroups. Understand Simpson’s paradox.
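Simpson’s paradox is easy to see with a small worked example. The numbers below are made up (a Berkeley-style admission scenario): within each department women are admitted at a *higher* rate, yet the aggregate numbers suggest the opposite.

```python
# Hypothetical (applicants, admitted) counts per department and group.
# Per department women are admitted at a higher rate; in aggregate men are.
data = {
    "dept_easy": {"men": (80, 48), "women": (20, 14)},  # men 60%, women 70%
    "dept_hard": {"men": (20, 4),  "women": (80, 20)},  # men 20%, women 25%
}

def rate(applicants, admitted):
    return admitted / applicants

# Per-department rates: women higher in both departments.
for dept, groups in data.items():
    assert rate(*groups["women"]) > rate(*groups["men"]), dept

# Aggregate rates: men higher overall, because men mostly apply
# to the department that is easy to get into.
men_total = [sum(x) for x in zip(*(g["men"] for g in data.values()))]
women_total = [sum(x) for x in zip(*(g["women"] for g in data.values()))]
print(rate(*men_total), rate(*women_total))  # 0.52 vs 0.34
```

If you only test for bias on the aggregate, you draw the opposite conclusion from what the subgroups show.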
There is a correlation between weather data and sales, but the client needs predictions 8 weeks ahead.
How do you get realistic predictions 8 weeks ahead?
Answers:
you don’t
you create a linear regression model that predicts weather 8 weeks ahead
Solution: be critical towards features. More is not always better. Talk to domain experts.
If you have 10,000 genes but only 150 patients, and you need to predict 3 types of cancer, you have too many features.
Solution: prune features, reduce dimensionality, regularize your model.
Symptom: your model performs perfectly in training and fails in real life.
Cause: your model is too complex and has “memorized” your training set.
Question: how big is the observation space (the number of possible observations) if you have 25 features, each either a 0 or a 1?
Answer: \(2^{25} \approx 3.4 \times 10^7\)
Solution: learn to calculate the size of your search space and observation space.
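The calculation above takes one line, and comparing it to a typical dataset size shows how sparse your coverage of the observation space really is:

```python
# Observation space for k binary features grows as 2^k.
n_features = 25
space = 2 ** n_features
print(space)  # 33554432, i.e. roughly 3.4e7

# With e.g. 150 datapoints you observe only a vanishing fraction of it.
coverage = 150 / space
print(coverage)
```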
This is an active area of research. A lot of companies are busy moving their data to the cloud.
When that is done, they often start using algorithms.
And then the question arises: is this fair? And how do we know?
We will look at Causal Bayesian Networks.
Networks are represented as a graph. A graph \(G\) is a pair \(G = (V, E)\), where \(V\) is a set of vertices (nodes) and \(E\) a set of edges between them.
Causal Bayesian Networks are popular tools used by the machine learning community to think about fairness.
The arrows between nodes represent causal relationships. The Bayesian part means every relationship has an associated (conditional) probability.
Assume a college admission scenario where students with gender \(G\) are admitted (\(A \in \{0, 1\}\)) based on their qualifications and choice of department.
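A toy version of such a network can be written down directly as conditional probability tables. All probabilities below are hypothetical; the point is only to show how a CBN factorizes the joint distribution, here as \(P(G, D, A) = P(G)\,P(D \mid G)\,P(A \mid D)\) for gender \(G\), department \(D\), and admission \(A\).

```python
# Toy causal Bayesian network: G (gender) -> D (department) -> A (admission).
# All numbers are hypothetical, for illustration only.
P_G = {"m": 0.5, "f": 0.5}
P_D_given_G = {  # department choice depends on gender
    "m": {"easy": 0.8, "hard": 0.2},
    "f": {"easy": 0.2, "hard": 0.8},
}
P_A_given_D = {  # admission probability depends only on the department
    "easy": 0.6,
    "hard": 0.2,
}

def p_admit(g):
    """Marginal admission probability for gender g, summing over departments."""
    return sum(P_D_given_G[g][d] * P_A_given_D[d] for d in P_A_given_D)

print(p_admit("m"), p_admit("f"))
```

Note that even though admission here depends only on the department, the marginal admission rates per gender differ, because gender influences the department choice; this is exactly the kind of indirect path that makes reasoning about fairness subtle.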
While this might sound very reasonable, and can be a very concise way of talking about complex networks of influences, there is an important caveat: the result is not what you would expect.
We then show, analytically and empirically, that [causal definitions of fairness] almost always result in strongly Pareto dominated decision policies
Pareto dominated is a technical way of saying: there is a better choice for everyone involved.
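The definition is simple enough to state as code. A sketch with a hypothetical utility profile per policy, one entry per stakeholder:

```python
def pareto_dominates(u_a, u_b):
    """True if utility profile u_a Pareto dominates u_b:
    at least as good for every stakeholder, strictly better for at least one."""
    pairs = list(zip(u_a, u_b))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

# Hypothetical stakeholder utilities: (student-body diversity, degree attainment)
fair_constrained = (0.4, 0.5)
alternative      = (0.6, 0.7)  # better on both axes

print(pareto_dominates(alternative, fair_constrained))  # True
```

A policy that is Pareto dominated, as in the quote above, means such a strictly better alternative exists.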
… policies constrained to satisfy causal fairness definitions would be disfavored by every stakeholder
… there is an alternative feasible policy that simultaneously achieves greater student-body diversity and higher college degree attainment
… we prove the resulting policies require admitting all students with the same probability, regardless of academic qualifications or group membership
The combination of Simpson’s paradox and the fact that causal definitions of fairness result in unfavorable solutions is problematic.
Chiappa, S., & Isaac, W. S. (2018, August). A causal Bayesian networks viewpoint on fairness. In IFIP International Summer School on Privacy and Identity Management (pp. 3-20). Springer, Cham.
Nilforoshan, H., Gaebler, J. D., Shroff, R., & Goel, S. (2022, June). Causal conceptions of fairness and their consequences. In International Conference on Machine Learning (pp. 16848-16887). PMLR.